Evaluating Automation Strategies in Language Documentation

Authors

  • Alexis Palmer
  • Taesun Moon
  • Jason Baldridge
Abstract

This paper presents pilot work integrating machine labeling and active learning with human annotation of data for the language documentation task of creating interlinearized gloss text (IGT) for the Mayan language Uspanteko. The practical goal is to produce a totally annotated corpus that is as accurate as possible given limited time for manual annotation. We describe ongoing pilot studies which examine the influence of three main factors on reducing the time spent to annotate IGT: suggestions from a machine labeler, sample selection methods, and annotator expertise.

Similar resources

Related Factors in Medical Records Documentation Quality and Presenting Solutions from Managers' and Physicians' Viewpoints Occupied in Hospitals Affiliated to Kashan University of Medical Sciences

Introduction: Medical record documentation is an important legal and professional requirement for all health professionals, and it helps ensure holistic care for the patient. The aim of this study was to determine factors affecting documentation quality from the viewpoints of managers and physicians in Kashan University of Medical Sciences and present solutions. Methods: This descriptive cr...

Evaluating the Automation of Architecture-based Design Activities

Reproduction or issue to third parties in any form whatsoever is not permitted without authority from the proprietors

A Pattern Language of Black-Box Test Design for Reactive Software Systems

Patterns have been successfully applied in software development to improve the development process, by facilitating reuse, communication and documentation of sound solutions. However, the testing domain is yet to benefit from a similar approach. This although, with the growing complexity of test automation solutions, identifying and instrumenting patterns in test design to facilitate reuse appe...

Identifying Style Awareness, Indirect Strategy Use and Preferences of Turkish Student Teachers of English

Styles and strategies are among the fundamental issues to be investigated in the language classroom in order to monitor the learning process of language learners and to increase their awareness levels. Research on learner style and strategies suggests that a certain degree of awareness of these issues helps both learners and teachers distinguish between the weak and strong aspects of the learning p...

Gujarati Character Identification: A Survey

English character recognition techniques have been studied extensively in the last two decades and have achieved remarkably high progress and success rates. For regional languages, however, these techniques are still emerging and their success rates are very poor. In Gujarat, there are thousands of people who can speak, write and understand only the Gujarati language. Rapid growing computation may increase Indian CR met...

Publication date: 2009